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ABSTRACT 

Cosmological surveys aim to use the evolution of the abundance of galaxy clusters to 
accurately constrain the cosmological model. In the context of ACDM, we show that 
it is possible to achieve the required percent level accuracy in the halo mass function 
with gravity-only cosmological simulations, and we provide simulation start and run 
parameter guidelines for doing so. Some previous works have had sufficient statistical 
precision, but lacked robust verification of absolute accuracy. Convergence tests of the 
mass function with, for example, simulation start redshift can exhibit false convergence 
of the mass function due to counteracting errors, potentially misleading one to infer 
overly optimistic estimations of simulation accuracy. Percent level accuracy is possible 
if initial condition particle mapping uses second order Lagrangian Perturbation The- 
ory, and if the start epoch is between 10 and 50 expansion factors before the epoch 
of halo formation of interest. The mass function for halos with fewer than ~ 1000 
particles is highly sensitive to simulation parameters and start redshift, implying a 
practical minimum mass resolution limit due to mass discreteness. The narrow range 
in converged start redshift suggests that it is not presently possible for a single simu- 
lation to capture accurately the cluster mass function while also starting early enough 
to model accurately the numbers of reionisation era galaxies, whose baryon feedback 
processes may affect later cluster properties. Ultimately, to fully exploit current and 
future cosmological surveys will require accurate modeling of baryon physics and ob- 
servable properties, a formidable challenge for which accurate gravity-only simulations 
are just an initial step. 

Key words: galaxies: halos - methods: N-body simulations - cosmology: theory - 
cosmology: dark matter 



1 INTRODUCTION 

In the vacuum energy dominated cold dark matter cosmo- 



logical model (hereafter ACDM, Komatsu et al. 2011 1, large- 
scale structures form through the amplification of small den- 
sity fluctuations via gravitational instability. At early times 
this amplification can be followed using linear perturbation 
theory of the general relativistic equations of motion for the 
field. At late times, owing to the nonlinearities in the equa- 
tions, and after shell-crossing, the dynamics may only be 
accurately followed using numerical simulations. Overdense 
regions of the density field, whose dynamics have broken 
away from the evolution of the background space-time and 
have reached some state of virial equilibrium are commonly 
referred to as dark matter halos. 

The growth rate of large-scale structures is directly 
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sensitive to the expansion rate of the Universe, and hence 
the cosmological parameters. One can show theoretically, 



through the excursion set formalism (Press & Schechter 



1974 Bond et al. 1991 Sheth & Tormen 19991, that the 



number of halos is also sensitive to cosmological parameters, 
and importantly for future surveys, the presence of "dark en- 
ergy" ^^^St^^^^98^^^^^^^Q01 ~ 



fc Hu||2004[ |Marian fc Bernstein||2006| |Cunha et al.| 



Lima 



2010 



Courtin et al. 20111. This forecast cosmological sensitivity 



has recently been verified through direct testing with A^- 
body simulations ( Smith fc Marian|2011 1. 



The amount of cosmological information that can be ex- 
tracted from cluster number counts is limited by our ability 
to detect signal-to-noise peaks in our observational survey - 
i.e. associate galaxies to groups, identify groups relative to 
an X-ray background noise level, etc. The lower that one can 
push the minimum detectable mass, the more cosmological 
information can be extracted from the survey. This comes 
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under the proviso that one can accurately calibrate the true- 
observable mass relation ship (jLinia Sz Hu 2005 , Marian et al. 
20091 IRozo et al.||2009l |Mandelbaum et al.||2010[ |OguriT 



Takada||2011 |Angulo et al.||2012| ). The numbers of rare ha- 
los are also sensitive to the level of non-Gaussianity in the 
primordial density field due to its effect upon the tail of ex- 
treme density fluctuations ( |Matarrese et al 



2000 Marian 



et al. 20111. Cluster counts are also sensitive to the total 
neutrino mass ( Wang et al.||2005] |Carbone et al.|[2012 Shi 



mon et al.||2012 



e.g. J Thus, surveys that promise to accu- 
rately measure the evolution of the abundance of groups and 
clusters, also have the potential to help probe fundamental 
physics. Accurate theoretical predictions for the cluster mass 
function and its dependence on cosmology, are therefore es- 
sential to fully exploit next generation cluster surveys. 

Current cosmological constraints from clusters come 



from: Vikhlinin et al. (20091; Vanderlinde et al. (2010k; Rozo 



et al.|(|2010[);|Sehgal et al.|(2011[ ); |Allen et al.| ( |26ll[ ); fPlanck 

Collaboration (2011 1. Over the next decade there will be a 



number of large surveys that will aim to strongly constrain 
the cosmological model through the abundance of clusters: 
in the X-ray there will be eROSITA ( |Pillepich et~aT|2012[ ), 
with the Sunyaev-Zel'Dovich method there will be Plancl^J 
in the optical using the weak lensing method there will be 
DES] Euclid ( [Laureijs et~aL]|2011[ ) and LSSl[] Several au- 
thors have estimated the requirements on the theoretical ac- 
curacy of the halo mass function to achieve the statistically 
limited constraints on cosmological parameters. |Wu et ah] 
(20101 point out that, in order to constrain time evolving 



dark energy models for DES, the theoretical mass function 
must be known with an accuracy < 0.5%. 

In this paper we address the question: What are the cor- 
rect numerical parameters needed to achieve percent level 
accuracy in the mass function in a cosmological simulation? 
Large simulation volumes (whether by a single simulation 
or by multiple realizations) are able to reduce statistical un- 
certainties due to finite halo numbers. However, large abso- 
lute volumes are needed to reduce systematic and statistical 
errors associated with poor sampling of large-scale density 
modes (e.g. |Barkana fc Loebp> 004 Bagla fc Ray|2005||Power 



fc Knebe|2006||Reed et al.|26o7||Crocce et al.|2(jlO||Smith fc 



Marian 



2011} . Over the past decade, impressive statistical 



precision in the halo mass function has been achieved using 



suites of cosmological simulations (Jenkins et al.|2001 War- 


ren et al.||2006; Reed et al. 
et al.|2010||Iliev et al.|2012 


|2007| |Tinker et al.|2008; Croccc 
Bhattacharya et al72011; Smith 


& Marianl2011 


Angulo et al.|2012 Alimi et al.|2012| Wat- 


son et al.||2012 


and others). However, statistical precision 



does not imply accuracy, even when considering gravity-only 
simulations. Sources of systematic error include finite sim- 
ulation volume, force resolution, mass resolution and dis- 
creteness effects, time-stepping, halo finding, initial condi- 
tion particle mapping, and start redshift. Recent progress 
includes |Crocce et al.| ( |2010[ | and [Bhattacha rya et al.| ( |2011[ ) , 
who each address many of the systematic uncertainties and 
determine a halo mass function with an estimated accu- 
racy of ~ 2% from a suite of large gravity-only cosmological 
boxes, though their results differ by significantly more than 
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this for halos larger than ~ lO 15 /i -1 M0. Moreover, neither 
approach has taken into account the full covariance matrix 
of mass function estimates when deriving best fit parameters 
( Smith fc Marian|201ll ). 

As a first step on the path toward producing an accurate 
mass function in the observational plane, we limit ourselves 
to demonstrating how percent level accuracy in gravity-only 
(i.e. collisionless) simulations (wherein baryons are present 
but interact only via gravity) can be accomplished. We show 
how to set up initial conditions so that percent level accuracy 
can be achieved. We also isolate and test the run parame- 
ters that control force resolution (force softening and tree 
opening angle) and time-step size, allowing us to determine 
the required values to achieve percent level convergence. Fi- 
nally, we ask: how many particles do we need to sample the 
halo mass distribution, in order to obtain a mass function 
accurate to better than < 1%. 

The paper breaks down as follows: in ij2]we discuss set- 
ting up the initial conditions for the structure formation 
simulations and the parameters used to run the A-body 
codes. In S|3] we describe the suite of A-body simulations 
performed and halo identification. In fj4] we present the re- 
sults for the halo mass function and its convergence with 
simulation parameters. In S|5]we explore the variation of the 
halo mass function with the method for generating the initial 
conditions. We also make a comparison between the results 
obtained from two well known A-body codes. In Sj6]we ex- 
plore the convergence of the matter power spectrum and the 
1-point probability density function of matter fluctuations. 
In iJTJwe discuss the remaining challenges of obtaining better 
than 1% accurate mass functions from structure formation 
simulations. In S|8]we summarize our findings and draw up 
a set of guidelines for obtaining accurate gravity-only mass 
functions. 



2 SIMULATING STRUCTURE FORMATION 
2.1 initial conditions 

In order to set up a simulation, we must first select the cos- 
mological model and the probability distribution of the pri- 
mordial density perturbations. In this study we shall work 
within the context of the ACDM paradigm and assume that 
the initial density modes are described by a Gaussian ran- 
dom field. The statistics of the field are thus fully speci- 
fied by the power spectrum. Hence, a particular realization 
of the density field in Fourier space may be obtained by 
drawing a set of uniform random phases and assigning am- 



plitudes drawn from the Rayleigh distribution (Efstathiou 
et al. 1 1985 1, or through the convolution of white noise with 



a filter that is related to the power spectrum (Bertschinger 



2001) 



A given density field must then be converted into a par- 
ticle distribution, and several techniques for doing this have 
been discussed in the literature (e.g. Efstathiou et al.|1985 



I Scoccimarrol [19981 |Bertschingerl|1999| |2001| |Crocce et~aLl 
|2006p . The traditional approach is to place particles on a 
uniform lattice, and these are then displaced off the initial 
points q using a displacement field *(q) that encodes all of 
the statistical properties of the density field. Hence, initial 
(Lagrangian) and final (Eulerian) positions, x, are related 
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through: 



*(q,r) 



(1) 



where the coordinates x are a solution to the equation of 
motion: 



d 2 x „., n 



(2) 



where in the above dr = dt/a(t) is conformal time, "H = 
aH(a), and $ is the peculiar gravitational potential. The 
solution for ^ is perturbative, and each order can be found 
through iteration with solutions of lower order. In terms of 
the initial density field, and up to second order, the solutions 



may be written (Zel'Dovich 1970 Buchert 


1994 


Buchert 


et al.|1994 Bouchet et al. 1995 Scoccimarro 


19981 




*(q,r) = -D 1 (a)V (? (1) (q) + J D 2 (a)V 9 


^» (2) (q; 


, (3) 



where Di(a) and D2(a) « ~3D'l(a) /7 are the first and sec- 
ond order growth factors suitable for ACDM. The potentials 
(^'(q) and <^ (q) can be found through iteratively solving 
the Poisson equations: 



* (1) (q) 



V^(q) = £ ^(q)^(q)-[^] , 



(4) 
(5) 



where <j> t ij = d 2 <j> / dqidqj . The linear solutions for ^f, with 
0* 2 ' = 0, yield the traditional Zel'Dovich approximation, 
which we refer to as 1LPT, and the second order so lutions, 

|l998| gave a 



Scoccimarro 



with (j) {2) , we refer to as 2LPT 
detailed prescription for implementing 2LPT displacements 
in simulations, and we make use of a slightly modified ver- 
sion of the publicly available code 2LPT|] |Crocce et al. ( 2006 1 
demonstrated that 2LPT reduces numerical "transients" to 
the level where an accurate representation of the halo mass 
function may be obtained for relatively late start times, 
df/di ~ 10, where ai and at are the initial and final ex- 
pansion factors. 

As can be seen from Eq. j3j, in the limit of asymptoti- 
cally high initial redshift 1LPT and 2LPT become equivalent 
since D^^ai) / D2(a,{) <C D\(p,i)/D\(ai). This has led some to 
speculate that, provided one takes the initial start redshift 
to be sufficiently high, then it should not matter whether 
one uses 1LPT or 2LPT. This issue will be investigated in 
detail in ^5] 

Several earlier studies have explored the importance of 
1LPT versus 2LPT initial conditions: iKnebe et all (12009 1 



used Gadget-2 to show that start redshift and 2LPT versus 
1LPT had little effect on internal halo properties, specifi- 
cally testing halo concentration, spin parameter, tri-axiality. 
They also found little dependence on halo mass or the halo 
mass function. However, their results may understate any 
numerical issues because they focused on smaller halos of 
10 10 -10 13 /i _1 M Q where the mass function is not very steep, 
and their statistics were limited due to usin g low-resolution 
N = 256 3 particles. A more recent study by [Jenkins (20101, 
has shown that there is a definite, although weak, depen- 
dence of the subhalo mass function inside Milky- Way mass 
halos on the choice of the initial conditions. 



One last issue, concerning the generation of initial 
Gaussian random density fields, is that some researches have 
advocated the use of the "Hann filter' J ( Bcrtschi nger|200~ I . 
This is an anti-aliasing filter ( Press et al|1992 l, and corre- 
sponds to setting the Fourier density modes that are out- 
side the Nyquist sphere of the simulation, k^y = tvN 1 ^ 3 /L, 
to vanish by multiplying the transfer function by W(k) = 
cos(nk/2kti y ) for k < k^ y and for k > fey The purpose 
of this is to mitigate some of the anisotropies in the forces 
due to the cubical lattice. In i|5]we shall also investigate how 
the presence of such filtering impacts our goal of an accurate 
mass function. 

Note that for some of the simulations where we test for 
parameter convergence, we will also make use of a modified 



version of Grafic-2 ( Bertschinger 2001). These two initial 
condition codes were verified to show identical convergence 
trends with start redshift. 



2.2 A-body codes 

Once we have obtained an initial condition, we then need to 
integrate the equations of motion, Eq. Q . In this study we 
shall make use of two standard A-body techniques: PKDGRAV 
V2.2.12 and Gadget-2. 

PKDGRAV is our primary simulation code for this study, an 
early version of which is described in Stadel (20011. The 



version of the code we use has been MPI parallelized, and 
uses the hierarchical tree data structure to organize the 
individual simulation particles. The gravitational force on 
each particle is calculated using a multipole expansion with 
Ewald summation to replicate the simulation cube as an 
approximation of an infinite periodic universe. The peculiar 
potential around any given particle is obtained from an hex- 
adecapole expansion of the forces. PKDGRAV uses a variable 
time step criterion that is synchronized for global time-steps. 
Particle orbits are integrated with the symplectic leapfrog 
integrator. 

Gadget-2 is a tree-particle-mesh (Tree-PM) code, and full 
details of which may be found inlSpringel (20051. The main 



difference with PKDGRAV is that on large scales it uses Fourier 
based methods to solve for the forces and only uses the tree 
algorithm to solve for forces on small scales. The solution 
for the potential is then obtained through an interpolation 
of the PM and tree forces over the force matching region, 
and typically this is ~ 4 — 5 mesh cells. 

In Sj5]we investigate the mass functions from these dif- 
ferent codes and explore the convergence properties with 
different 1LPT and 2LPT start redshifts. Additionally, we 
aim to determine the typical values for "generic" run pa- 
rameters that are required for percent level convergence. In 
what follows we shall describe parameters that are mainly 
specific to PKDGRAV, but will make reference as to how they 
apply to Gadget-2 or other codes. Gadget-2 parameters are 



tested further in Smith et al. (20121. The run parameters 
that we focus on are: 

Tree opening angle O: The tree opening angle controls 
the accuracy of medium and long range forces. It does this 
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by setting the minimum distance between a given particle 
and tree node below which the tree node will be "opened" . 
Thus the force calculations for a given particle will include 
contributions from entire nodes and or individual particles. 
A discussion of how O relates to other tree types can be 



found in Stadel (20011 



Softening e: In order to avoid excessively large accelera- 
tions and hence excessively short time-steps, the small-scale 
gravitational interactions must be "softened". This makes 
sense for simulations of collisonless systems like CDM, where 
the particles represent large coarse grained elements of the 
microscopic phase space. PKDGRAV and Gadget-2 both use a 
softened kernel: gravitational forces approach zero for spa- 
tially coinciding particles, and become Newtonian at 2e for 
PKDGRAV and 2 .8e for Gadget- 2. PKDGRAV uses the K 3 soft- 
ening kernal of Dehnen (2001 1 while Gadget-2 uses a spline 



kernel. The force softening leads to a minimum resolved spa- 
tial scale. Throughout, we make use of constant comoving 
softening. 

Time-step r\: Each particle is on an adaptive time-step with 
length proportional to the time-step parameter r\. The actual 
time-step length for each particle is based on the magnitude 
of its current acceleration |a|, the softening length e, and the 
time-step parameter r/, in accordance with the relation: 

dt > vVWW) • (6) 

This technique allows significant computational savings in 
cosmological simulations when only a small fraction of par- 
ticles are in dense regions requiring the shortest time-steps. 

In summary, we shall investigate how the halo mass 
function varies with: the initial start redshift; with 1LPT or 
2LPT initial conditions; with Nyquist filtering; we shall ex- 
plore results for two simulation codes; and how variations in 
0, e, 77 affect our results. Besides these, we shall also explore 
finite volume effects and mass resolution. 



3 SIMULATIONS 
3.1 Simulation suite 

We have generated a suite of A-body simulations that are 
designed to explore the accuracy with which we may esti- 
mate the halo mass function and its dependence on how we 
simulate the dark matter (as discussed in fj2|. All of the 
simulations that we have performed evolve N = 1024 3 equal 
mass dark matter particles. We consider periodic cubes of 
size L — 17.625 /i _1 Mpc evolved to z = 10, and cubes of size 
L = 2048 h~ Mpc evolved to z = 0. The relative box sizes 
were chosen so that halos corresponding to ~ 3<r fluctuations 
in the density field are sampled by Ah ~ 1000 particles at 
the final output. This corresponds to M ~ 3.8 x 10 s h~ 1 M Q 
for the small boxes at * = 10, and M ~ 6.1 x 1O 14 /i _1 M at 
z — for the larger boxes. Thus, the final halos in the small 
box simulations are in an evolutionary state similar to the 
clusters in the large box simulations at z = 0. 

Although the L = 17.625 /i _1 Mpc box is very small, the 
effects of finite volume on our study are attenuated because 
we examine halos early, at z = 10, when the typical halo 
mass-scale is still much smaller than the total simulation 
mass. There is no need to apply a finite volume correction 



as in e.g. Reed et al. (20071 to these simulations because 



each of our convergence series utilizes identical initial con- 
ditions and particle displacements. Finite volume effects, to 
the extent that they can be accounted for with a simple 
linear correction technique, are thus identical within each 
convergence series. 

The cosmological parameters that we adopted for the 



et al. 



small box runs were consistent with WMAP5 (Komatsu 
j2009| : fl 
0.705, n s ■■ 



h 



= 0.274, 
0.96, a 8 = 



Q A = 0.726, Q b = 0.046, 
0.812, where these parame- 



ters are the density parameters in matter, vacuum energy, 
and baryons; the dimensionless Hubble parameter; the pri- 
mordial power spectral index; and the variance of the den- 
sity fluctuations on scales of R — 8/1 Mpc. The trans- 
fer function that we used to create the initial conditions 
was produced using the prescription of Eisen stein fc Hu| 
( |1998[ ). For the large box runs, the cosmological parameters 



we adopted were consistent with WMAP7 (Komatsu et al 
20U\ : Qm = 0.2726 
0.963, a s = 



Q A = 0.7274, n b = 0.046, h = 0.704, 
9. The transfer function for these runs 



was computed using CAMB ( Lewis et al.|2000 l. Full details 
for the entire suite of simulations are given in Table [I] 



3.2 Halo identification 

We identify dark matter halos using the friends- of- friends 



(FoF) algorithm (Davis et al. 19851. This links together 



all particles that are separated by less than the linking 
length 6, where 6 is expressed in units of the mean inter- 
particle separation. Throughout, we use b = 0.2, and we use 
the particular implementation of the algorithm internal to 
PKDGRAV; a similar implementation is provided by the code 
f oi[] The estimated iso-density conto ur that the FoF r ecov- 
ers is S = Sp/p = n c b~ 3 - 1 as 81.62 ( |More et al.|2011 1. 

In the literature there is a wide range of alternate ap- 
proaches to the identification of dark matter halos: e.g. the 
spherical over-density (SO) algorithm ( Lacey fc Cole||1994 



Tinker et al. 2008|; 6-D phase space algorithms (Behroozi 
et al.|l2013| >; and for a review of methods see IKnebe et al. 
(20111). Rather than explore all of these different methods 



here, we shall work under the assumption that: an accurate 
FoF mass function implies an accurate mass function for all 
other algorithms designed to select approximately virialised 
objects. Anecdotal support for this is provided by |McBride| 
et al. I ( |2011[ ), who found similar convergence behavior with 
simulation set-up for the FoF (b = 0.2) and SO 200 mass 
functions. We shall reserve the task of establishing the ve- 
racity of this assumption for future work. 

A number of systematic errors have been noted for ha- 
los obtained with the FoF algorithm. Firstly, the recovered 
halo masses are systematically overestimated with respect 
to the "true" halo mas s |Warren et al.| 2006| |Lukic et al.| 
2009||Trenti et al.|2010]|More et al.|2011 |. This owes to the 
fact that the true halo mass distribution is sampled by a fi- 
nite number of particles. This mass overestimation has been 



quantified for spherical mock halos by Lukic et al. ( 2009 1 



and more recently by More et al. (2011 1. Secondly, FoF ha- 
los also experience "bridging", which may occur when two 
distinct halos undergo a close encounter. This systematic 
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Code IC method L[h~ 1 Mpc] m p [h _1 M©] z\ z t 77 e 



PKDGRAV 


1LPT 


17.625 


3.88 x 10 5 


30, 49, 100, 200, 400 


9.8 


0.15 


lm/50 




0.55 




PKDGRAV 


2LPT 


17.625 


3.88 x 10 s 


30, 49, 100, 200, 400 


9.8 


0.15 


im/50 




0.55 




Gadget-2 


1LPT 


17.625 


3.88 x 10 5 


30, 49, 100, 200, 400 


9.8 


0.2 


W30 




0.5 




Gadget-2 


2LPT 


17.625 


3.88 x 10 5 


30, 49, 100, 200, 400 


9.8 


0.2 


U/30 




0.5 




PKDGRAV 


2LPT 


17.625 


3.88 x 10 5 


400 


9.8 


0.15 


W50 




0.4 




PKDGRAV 


2LPT 


17.625 


3.88 x 10 5 


400 


9.8 


0.15 


U/50 




0.68 




PKDGRAV 


2LPT 


17.625 


3.88 x 10 5 


400 


9.8 


0.15 


lm/50 




0.8 




PKDGRAV 


2LPT 


17.625 


3.88 X 10 


400 


9.8 


0.07 


lm/50 




0.55 




PKDGRAV 


2LPT 


17.625 


3.88 X 10° 


400 


9.8 


0.2 


lm/50 




0.55 




PKDGRAV 


2LPT 


17.625 


3.88 X 10 5 


400 


9.8 


0.25 


lm/50 




0.55 




PKDGRAV 


2LPT 


17.625 


3.88 X 10 5 


400 


9.8 


0.3 


lm/50 




0.55 




PKDGRAV 


2LPT 


17.625 


3.88 x 10 5 


400 


9.8 


0.6 


lm/50 




0.55 




F\T/r\ /*! 7-\ ATT 

PKDGRAV 


1 T DT ~ 

lLPl-g 


17.625 


3.88 X 10 


400 


9.8 


0.15 


1 /cn 
tm/50 




0.55 




PKDGRAV 


lLPT-g-HF 


17.625 


3.88 x 10 5 


400 


9.8 


0.15 


U/50 




0.55 




PKDGRAV 


lLPT-g 


17.625 


3.88 x 10 5 


400 


9.8 


0.15 


lm/20 




0.55 




PKDGRAV 


lLPT-g 


17.625 


3.88 x 10 5 


400 


9.8 


0.15 


lm/10 




0.55 




PKDGRAV 


lLPT-g 


17.625 


3.88 x 10 5 


100 


9.8 


0.15 


lm/5 




0.55 




PKDGRAV 


lLPT-g 


17.625 


3.88 x 10 5 


400 


9.8 


0.15 


lm/2 




0.55 




PKDGRAV 


1LPT 


2048 


6.05 x 10 11 


30, 49, 200 


0.0 


0.15 


lm/50 


0.55(2 


> 2),0.7(2 


<2) 


PKDGRAV 


2LPT 


2048 


6.05 x 10 11 


30, 49, 100, 200, 400 


0.0 


0.15 


lm/50 


0.55(2 


> 2),0.7(2 


<2) 


PKDGRAV 


lLPT-g 


2048 


6.05 x 10 11 


400 


0.0 


0.15 


lm/50 


0.55(2 


> 2),0.7(2 


<2) 


Gadget-2 


2LPT 


2048 


6.05 x 10 11 


30, 200 


0.0 


0.2 


lm/20 




0.5 





Table 1. Suite of cosmological simulations. Col. 1: A^-body code used. Col. 2: initial condition method, note that we used the 2LPT 
code throughout, except where there is a -g, which denotes the use of Graf ic-2; -HF denotes use of a Hann filter. Col. 3: box size. 
Col. 4: particle mass. Col. 5: initial condition start redshifts that have been simulated. Col. 6: final redshift. Col. 7: time-stepping 
parameter. Col. 8: Force softening, e, in units of the mean inter-particle spacing i m . Col. 9: tree opening angle (ErrTolTheta for Gadget-2 
runs). The additional Gadget-2 parameters were set to: ErrTolIntAccuracy=0.02, MaxRMSDisplacementFac=0.2, MaxSizeTimestep=0.02, 
MinSizeTimestep=0.000, ErrTolForceAcc=0.005, TreeDomainUpdateFrequency=0.05, PMGRID=1024. 



effect: links unvirialised systems, it acts to reduce the over- 
all number of halos found, and it predominantly occurs for 
the highest mass objects and is stronger at higher redshifts 
( Lukic et al.|2009[> 



Warren et al. (20061 and Bhattacharya et al. (20111 



have provided empirical corrections, determined from cos- 
mological simulations, for the systematic FoF errors. How- 
ever, these empirical models are mass and redshift inde- 
pendent, which may make them insufficient for our goal of 
achieving a ~ 1% accurate mass function. We would expect 
the FoF errors to include dependencies on the specific dis- 
tribution of mass within halos, which depends on both mass 
and redshift. Nevertheless, we assert that the FoF mass over- 
estimation should not affect our convergence tests because 
they are all performed at the same mass resolution. For these 
reasons, we use the raw FoF masses, and remark that this 
will affect our ability to recover an "unbiased", FoF mass 
function. However, this is sufficient to our purposes of quan- 
tifying relative differences in the mass function of different 
simulations. 



4 MASS FUNCTION CONVERGENCE I: 
SIMULATION PARAMETERS 

In this section, we show convergence of the FoF halo mass 
function with varying simulation run and set-up parameters. 



We estimate the number of halos per logarithmic mass inter- 
val, dn(M) /dlogioM , using a Gaussian kernel in log mass. 
The Gaussian kernel is convenient for diagnostics because it 
avoids the 'saw-tooth' pattern that emerges in a binned mass 
function as a result of the discretization of halo masses, par- 
ticularly at low masses where the simulation particle mass is 
a significant fraction of the mass- width of a bin. The number 
of halos with mass Mk is estimated by the sum: 



(7) 



where f g is a Gaussian kernel (in logioM) of width h — 
0.0625, chosen to minimize Poisson fluctuations without in- 
troducing systematic error to the mass function. To mini- 
mize computational cost, we truncate the kernel beyond a 
range of ±a = 3h. The number of halos per logarithmic mass 
interval per unit volume is: 

<HM) = N k 
dlogioM Vh erf(a/\/2) 

where V is the simulation volume. The kernel mass Mk is 
estimated by the following: 



(9) 



analogous to using the average halo mass in a bin. 

Poisson errors can be estimated from the Gaussian ker- 



nel halo numbers Nh, through use of the formula (Heinrich 



© 2012 RAS, MNRAS 000, ppO] 



6 Reed et al. 



2003 utilized for the halo mass function in Lukic et al. 2007 1 

1 



<r± = \ N, 



-A 



(10) 



In what follows, we will show results for all halos with 
more than 8 particles per halo. This is done for purely diag- 
nostic purposes, since the poorly resolved halos are strongly 
affected by particle shot-noise errors. 



4.1 Tree opening angle 6 

The top panel of Figure [I] presents the results from our con- 
vergence study of the tree-opening angle parameter 0, used 
in the code PKDGRAV. These results are for the case of the 
z = 10, L = 17.625 ft _1 Mpc runs. The figure clearly shows 
that, for halos with masses M < 10 9 /i _1 Mq, the runs with 
< 0.68 are converged at the sub-percent level. For ha- 
los with masses M > 1O 9 /i _1 M (N p > 3000), the figure 
shows that scatter in the mass function begins to dominate 
our results. This implies that systematic errors at greater 
than the percent level cannot be ruled out for more massive 
halos, which may be most affected by tree-related criteria. 
In this case, the = 0.8 run deviates modestly from the 
other runs. Larger values of have been shown to cause 
significant direct force errors ( [Stadel|200lj ). 

The increased scatter in the mass function for halos with 
M > 10 9 Mq may seem somewhat surprising, given that 
all of our simulations had the same initial realization of the 
density field so that there is no "sample variance" between 
them. However, even the most accurate particle simulation 
is still essentially a Monte Carlo representation of a mass 
distribution. This means that the mass of each halo has a 
significant uncertainty, which can at best be equal to the 
square root of the number of its particles. The scatter in 
the mass function arises from a convolution of the true halo 
number counts with the scatter associated with simulating, 
sampling, and measuring the masses of halos (see discus- 
sion in f 3.2 1 . Hence it is non-trivial to determine the true 
uncertainty. Fig. [I] shows the expected Poisson errors (long 
thin error bars), and one can see that differences between 
well-converged runs are at sub-Poisson levels. 

For a better estimate of the uncertainty, we create ran- 
domly subsampled l-in-8 particle volumes from the full sim- 
ulation snapshot and measure the 1 — a scatter between the 
FoF mass functions of the subsampled volumes. In Fig. [TJ 
the results of this exercise are denoted by the short, thick, 
red error bars. This scatter tends to be an overestimate of 
the true uncertainty, since the scatter in the FoF mass will 
be larger in the randomly sampled volume due to the smaller 
numbers of particles per halo. For the most massive halos, 
this overestimation may become worse due to the increased 
effects of halo bridging (or unbridging) , which has a large ef- 
fect on halo masses and thus on the inferred mass function. 
Taking these issues into account, we estimate that we are 
sensitive to percent level systematic shifts in the FoF mass 
function for halos with less than ~ 3000 particles. 



4.2 Force softening e 

The central panel of Figure [l] presents the results from our 
convergence study of the force softening parameter e. As can 
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Figure 1. Variation of the halo mass function with simulation 
parameters, relative to the mass function obtained from our fidu- 
cial simulation, as a function of FoF halo mass. All panels show 
the results from the L = 17.625 /i -1 Mpc, PKDGRAV simulations at 
z = 10. All runs used the same initial realization of the density 
field. Long thin error bars show the Poisson errors estimated from 
the number of halos. Short thick red error bars are estimated from 
the scatter of the mass functions of sub-sampled versions of the 
simulations. The top, middle and bottom panels show the results 
of variations in tree opening angle ©, force softening e, and adap- 
tive time-step parameter rj, respectively. The black dashed line 
in the middle panel shows the large (and assumedly undesirable) 
impact of application of the Hann anti-aliasing filter to the initial 
density field, on the halo mass function. Percent level convergence 
is apparent for each run parameter for halos larger than ~ 100 
particles. © 2012 RAS, MNRAS OOO,[T|f20l 
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be seen, the commonly used softening value of e — i m /20 is 
converged at near the percent level over all masses (Z m = 
L/N 1 / 3 ). We also see that the low-mass end of the mass 
function is very sensitive to the choice of e. Halos with N > 
1000 particles appear only weakly dependent on e, provided 
e < Z m /10 and within our statistical limitations. 

For the cases where e > Z m /10, forces do not become 
Newtonian until beyond the FoF linking length of Z m /5. In- 
terestingly, for these large softening lengths, we find that the 
halo abundances at a fixed mass are increased. One possible 
explanation for this effect is that the lower central densi- 



ties of the heavily softened halo profiles l 


Moore et al.|1998| 


Fukushige & Makino||2001| |Power et al. 


l2003||Reed et al.| 


2005 


Tinker et al.|2008 e.g. ) lead to hi 


idler densities near 



the outer edges of halos. This increased outer-shell density, 
implies larger FoF masses as more particles are linked to the 
outer layers ( Bhattacharya et al.||20lT l; (see E 6.2 1 . For the 



smallest halos with few particles, the virial radii and soften- 
ing are of similar order. The minimum resolved halo mass 
thus increases as softening increases (Lukic et al. 2007 e.g. ). 
These issues have implications for a common running mode 
for cosmological simulations where computational speed-up 
at high redshifts is achieved by using a "physical softening" , 
wherein the comoving softening length is scaled by the ex- 
pansion factor, with a typical maximum of f so f t max ~ lOe. 
Effectively, in this mode, the softening is frozen in physical 
coordinates at scale factor a= l/f so ft max- Our tests sug- 
gest that allowing a comoving softening larger than Z m /20 
at high redshifts leads to significant numerical error for the 
early forming halos, which is likely to affect high density 
regions at late times. 

Finally, the black dashed line in the central panel of 
Fig. [I] shows the results of applying the Hann anti-aliasing 
filter. Whilst it may help to correct errors due to the an- 
isotropic lattice structure, it introduces a ~ 30% suppres- 
sion in the number of lower-mass halos relative to the un- 
filtered initial conditions run and relative to the expected 
nearly power-law behavior of the mass function predicted 
from theory ( |Bond et al.|199lj ). Hann filtered and unfiltered 
runs only agree at the percent level for halos with N > 3000 
particles. 



4.3 Time-step size r\ 

The bottom panel of Figure [l] presents our study of how 
variations in the adaptive time stepping parameter 77 affects 
the estimated mass functions. The results demonstrate that, 
for halos with M < 10 9 h~ 1 M e (3000 particles), an increase 
in 77 leads to a decrease in the abundance of halos. We find 
that percent level convergence in the mass function can be 
achieved with n w 0.15 for all halo masses, or 77 ~ 0.2 for 
a 100 particle minimum mass. This value is consistent with 
the value of 77 = 0.2 found by Power et al. (20031, who ex- 



amined the convergence behaviour of the density profiles of 
dark matter halos. This similar converged time-step size is 
not surprising given that low-redshift halo centers consist of 



some of the earliest material to be bound into halos (Die 



mand et al. 20051. For halos with N > 1000 particles, the 
mass function converges with a much longer time-step cor- 
responding to 77 = 0.6. 

Finally, we note that the value 77 = 0.2 for PKDGRAV is 
similar to the default size of the adaptive time stepping in 



Gadget-2: the parameter ErrTolIntAccuracy = 77 2 /2 has a 
default setting of 0.025, which corresponds to 77 = 0.22. 



5 MASS FUNCTION CONVERGENCE II: 
INITIAL CONDITIONS 

5.1 Results: Small boxes at z=10 

Figure [2] compares the behaviour of the 1LPT and 2LPT ini- 
tial conditions, as a function of the start redshift Zj, for the 
small-box simulations at z — 10 evolved with PKDGRAV. The 
top and middle panels show the ratios of the halo mass func- 
tions for different start redshifts with the halo mass function 
obtained from the start redshift Z\ = 800, for 1LPT and 
2LPT, respectively. The bottom panel presents the ratio of 
the 1LPT and 2LPT mass functions for simulations with the 
same start redshift. We see that both the 1LPT and 2LPT 
initial conditions converge to yield the same mass function 
as start redshift increases. However, the convergence prop- 
erties of the 1LPT runs are very slow, whereas the 2LPT 
runs appear to converge much faster. 

For the case of 2LPT, we notice that percent level con- 
vergence can be achieved for halos with at least ~ 1000 par- 
ticles and in simulations that have undergone ai /ai > 10 ex- 
pansions. For 1LPT, the at/m = 80 run (zi = 800) is barely 
converged down to N ~ 1000. For halos, with TV < 1000 par- 
ticles, even the 2LPT mass function is poorly converged with 
Zj for all expansion factors tested. The abundances of small 
halos appear to diminish as start redshift is increased; this is 
apparent in both the 2LPT and 1LPT tests. This suggests 
that N ~ 1000 particles represents a minimum halo mass 
for stability to Z\. 

A curious coincidental feature of the 1LPT initial con- 
dition series is that small halos appear to be converged at 
the ~ 2% level by z; = 200 (except for the largest masses) 
and nearly at the ~ 1% level by Z; = 400 (except for the 
smallest masses). The more accurate 2LPT start redshift 
series confirms that this 1LPT convergence is an illusion. 
With later start redshift, the increased number of halos due 
to more accuracy in the 1LPT initial conditions is offset by 
independent errors that act to decrease the number of ha- 
los, resulting in false convergence. This highlights the fact 
that convergence is a necessary but not sufficient condition 
to guarantee accuracy. Some previous studies that appeared 
to show good mass function convergence with early enough 
1LPT initialization, such as Reed et al. ( |2003[ ) and Reed 
et al. (20071 (Fig. Al), among others, likely also suffered 
from this false convergence. Our larger particle numbers and 
corresponding better statistical uncertainty, combined with 
2LPT comparisons, allow us to make more robust conclu- 
sions. 



5.2 Results: Large boxes at z=0 

In order to confirm that the convergence behavior of the 
small box simulations at z — 10 can be applied to the cluster 
regime at lower redshifts, we present the results from tests 
run to z — in the L = 2 /i -1 Gpc boxes. 

Figure [3] shows the results from the 1LPT and 2LPT 
Zi convergence simulations run with the code PKDGRAV. As 
for the case of the small-box simulations at z = 10, we see 
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Figure 2. Variation in the halo mass function as a function of start redshift, for 1LPT and 2LPT initial conditions using the TV-body 
code PKDGRAV. Top panel, results for 1LPT initial conditions. Middle panel: results for 2LPT initial conditions. Bottom panel: the ratio 
of the 1LPT and 2LPT mass functions approaches unity at high initial start redshifts. All panels show results of the L = 17.625 h _1 Mpc 
box at z = 10.0. The 1LPT series displays 'false convergence' for ~ 100 particle halos. Percent level convergence is met for 2LPT for 
N P > 1000 particles to Z\ = 400 or a t /a; RJ 40. 



that low-mass halos are missing with high start redshifts 
z > 49; the z = "pivot" mass-scale, below which conver- 
gence is poor, is somewhat smaller at N ~ 300 particles. For 
larger halos, the 2LPT mass function is well-converged so 
long as z\ H 49; the z\ = 100 curve deviates from the z\ = 30 
curve at just over the percent level at this mass scale, so 
is marginally statistically consistent at the percent level. A 
striking feature here is that when a high enough start red- 
shift is used that Zel'Dovich and 2LPT initial conditions are 
converged, z\ — 200, serious errors are present in the z — 
mass function. A likely explanation is that with such a high 
Zi, cosmological perturbation amplitudes becomes smaller 
than the effective amplitude of spurious numerical pertur- 
bations. In this Zi — 200 run, spurious halos begin to form 
at very early times, initially dominated by 8 particle struc- 
tures; visual inspection reveals that the spurious halos are 
aligned with the initial grid of particles. The effects of these 
early spurious halos lead to the over-abundance of halos at 
z = 0. This underscores the point that 2LPT initial condi- 



tions are preferable because they allow one to start at lower 
redshift where numerical errors are more controllable. 



We note that pure particle mesh codes may perform 
better with high redshift starts because the PM technique is 
well-suited to following low amplitude linear perturbations. 
A tree code (and also a tree-PM code), however, is subject 
to force errors that may accumulate over time, even if they 
are small, because these errors are correlated with the tree 
structure. Tree code force errors could thus seed spurious 
structures that dominate over real cosmological perturba- 
tions when start redshift is very high (and initial cosmolog- 
ical perturbations are very low). Further, the accumulated 
errors would be expected to worsen if time-step length is 
decreased. The PM code, although it may perform better at 
early times, is limited in spatial resolution by the mesh size, 
which is typically much larger than the softening scale of 
a tree (or tree-PM) code, making it non-ideal for modeling 
the internal properties of halos. 
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Figure 3. Mass functions from the simulations started with 
ILPT and 2LPT initial conditions in the L = 2h~ 1 Gpc boxes 
at z = 0, relative to the simulation started at z; = 30 with 2LPT 
initial conditions. All simulations here were run with PKDGRAV. 
Percent level convergence with 2LPT is found between z\ = 30 
and z\ = 49 runs at ~ 1000 particles. Extremely early starts ( 
a i/ a \ ?J 100) lead to serious errors. 




2 3 
v = S c /a(m, z) 

Figure 4. Ratio of ILPT mass functions with those obtained 
from 2LPT initial conditions as a function of the equivalent halo 
peak height v. The thick solid black and thick hatched red line 
show results from the L = 17.625 /i -1 Mpc boxes at z > 10, and 
the thin green dashed, solid black and hatched red show results 
from the L = 2h~ x Gpc boxes. All results here are for PKDGRAV. 
The ratio of 1LPT/2LPT mass functions is 'universal' in that it 
depends mainly upon df/a; and u, independent of the specific 
halo mass range and redshift, and fit by the dot-dashed line. 



5.3 Transformation to universality 

It has been noted that when halo masses are translated into 
equivalent 'peak-height', i.e., where M — > v(M,a), then the 
mass function takes on a 'universal form 



extrapolated over-density threshold for collapse in the spher- 
ical collapse model, and where <r(M, z) is the variance of 
matter fluctuations on mass scale M — 4ttR 3 p/3. Note also 
that owing to the fact that a(M,a) oc D(a), v oc D _1 (a), 
where D(a) is the linear theory growth factor. 

In Figure [4] we present the ratio of the ILPT with the 
the 2LPT mass functions for the small box at z = 10 and the 
big box at z — as a function of v. We see that the ILPT 
mass function error appears to be relatively independent of 
mass and redshift for equal values of v. This universal behav- 
ior is expected in Press-Schechter formalism wherein mean 
halo formation time depends only upon v, and S c is indepen- 
dent of redshift. We present a fit to the ratio of the ILPT 
to 2LPT mass function for our range of data 1 £ v ^ 5, 



dniLPT/dn2LPT = e 



(11) 



( Sheth fc Tor- 

men|1999| . The peak height is defined through the relation, 
v = 5 c /cr(M, z), where 8 C = 1.686 is the present day linearly 



Percent level convergence between ILPT and 2LPT initial 
conditions at scales relevant for the cluster mass function 
is not achieved until extremely early starts at a{/a\ ~ 200. 
However, such early start redshifts lead to very small initial 
density perturbations, which through the relative increase 
of numerical errors, preclude the codes that we tested from 
being able to accurately follow the growth of structure. 



5.4 Comparison of PKDGRAV with Gadget-2 

The quest for obtaining mass function predictions accurate 
to within one percent requires different iV-body simulation 
codes to provide consistent results at this level. Of course, if 
results disagree at > 1% for any two codes, then one would 
need at least three independent iV-body codes to break the 
degeneracy and so decide which results were correct. Having 
said that, we now compare the initial condition convergence 
series of simulations obtained using PKDGRAV with results ob- 
tained from the widely used iV-body code Gadget-2. Note 
that we have made no attempt to find 'optimal' parame- 
ters for Gadget-2, but instead we have adopted some rather 
generic choices for these. The full list of simulations that we 
have performed with Gadget-2, including the exact choices 
for run parameters, are presented in Table [l] 

In Figure [5] we compare the ILPT and 2LPT initial 
conditions as a function of zi, but this time using Gadget-2, 
plotted here down to the limit TV = 20 particles. We find that 
the results exhibit almost identical behaviour to those ob- 
tained from the PKDGRAV runs. The small difference is that 
the suppression of the mass function at low masses with 
increasing start redshift, apparent in 2LPT runs for halos 
with fewer than N ~ 1000 particles, is slightly milder with 
Gadget-2. This appears to enhance the effect of "false con- 
vergence" of the ILPT Gadget-2 simulations. 

Figure [6] shows the ratio of the mass functions obtained 
from Gadget-2 with those obtained from PKDGRAV. Note that 
we used identical initial conditions in all cases. We find that 
Gadget-2 systematically produces up to a 10% higher mass 
function for low-mass halos (TV < 200) than PKDGRAV. This 
excess abundance slightly increases with increasing start 
redshift. We note that the differing mass functions could 
be a result of differing halo structure, which could lead to 
systematic differences in FoF masses, and does not necessar- 
ily mean the codes are truly producing different numbers of 
halos. We also note that Gadget-2 appears to have several 
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percent fewer halos at N ~ 1000 particles, for the highest 
start redshifts. 

Figure [7] compares 2LPT mass functions from Gadget -2 
and PKDGRAV in the large box at z = 0. This figure shows 
that when the lower redshift start is used, <Xf/ai = 30, the 
two codes generally agree within 2%. However, when high 
redshift starts are used, Of/ai ~ 200, the codes diverge 
from each other and from the true answer - recall Figure [3] 
where we showed that the lower redshift start is converged 
in PKDGRAV runs. 

This code comparison shows that there is a weak sys- 
tematic shift with start redshift between the codes. How- 
ever, the convergence behavior of Gadget-2 with start red- 
shift and of 1LPT versus 2LPT still provides useful veri- 
fication of the PKDGRAV initial condition tests. Ultimately, 
a robust comparison of absolute accuracy between codes 
would require that run parameters for each code are run 
at self-converged values. The code differences are consistent 
with the level of agreement found between these same codes 
in Heitmann et al. ( 2008 1 when considering our improved 



statistics and resolution. Further investigation is warranted 
to reveal whether the differences between the two codes is 
caused by inherent differences between the TreePM and the 
pure tree method or whether they are instead due to the use 
of non-ideal run parameters in Gadget-2. This is beyond the 
scope of this paper and we shall reserve a wider study for 
future work. 



6 CONVERGENCE OF OTHER PROPERTIES 

In this section we consider the sensitivity of other dark 
matter statistics: to the adopted simulation parameters; to 
whether we employ 1LPT or 2LPT initial conditions; and to 
the adopted start redshift. We shall restrict our exploration 
to the mass power spectrum and the 1-point Probability 
Distribution Function (PDF) of dark matter density, as ad- 
ditional diagnostics for determining simulation accuracy. 



6.1 Mass power spectra 

For a finite cubical patch of the Universe, the matter power 
spectrum is defined to be: 



<* tl £ a >= J, (lkl)*k 1 ,k a /^ 



(12) 



where is the volume of the patch and is the dis- 
crete Fourier series expansion of the density field. For equal 
mass dark matter particles, the discrete representation of the 
Fourier space density field can be written ( |Peebles|l 980): 



*(*) = 



5(k) 



Ye 

N 
1 



1 ^ 



d 3 x5(x)e*' x 



(13) 



where N is the number of particles. The matter power spec- 
tra were estimated for each simulation using the standard 
Fourier ba sed methods ( |Smith et aL|2003||jmg|2005||Snuth 
et al. 2008 1: particles and halo centers were inter polated onto 
a 1024 d cubical mesh, using the CIC algorithm (Hockney & 
Eastwood|1988 \ ; the Fast Fourier Transform of the discrete 
mesh was computed using the FFTW libraries; the power 



in each Fourier mode was estimated and then corrected for 
the CIC charge assignment; these estimates were then bin 
averaged in spherical shells of logarithmic thickness. 

Before we proceed to the results, we note that it is not 
necessarily the case that a simulation that yields a < 1% 
accurate halo mass function should also yield a < 1% ac- 
curate matter power spectrum, and vice-versa. Different re- 
quirements for simulation parameters are possible because 
the mass range in our mass functions only involves a small 
fraction of the total mass in the simulation. And because, 
our estimates of the measured power spectra do not extend 
to scales as small as the virial radius of the smallest halos 
considered. 



6.1.1 Variation with simulation parameters 

Figure [8] shows the dependence of the matter power spec- 
trum on the simulation run parameters: (top panel) tree- 
opening angle O; (middle panel) the force softening param- 
eter e; (bottom panel) time-step parameter n, for the N- 
body simulation code PKDGRAV. These results show that the 
estimated power spectra, on large scale^] (k/k[ nn < 10), are 
only weakly sensitive to variations in the choice of (Q, e, rf). 
However, on smaller scales, the power spectra show signif- 
icant deviations. For the force-softening tests, we find that 
for e < i m /5 the spectra appear to be converged at the sub- 
percent level on large scales, with a 'bump' at k/k{ uu « 100 
and a steep drop at smaller scales. This small-scale drop in 
power is consistent with the puffing up of halo cores that 
appears to affect the mass function when softening is large. 
Similarly, for the case of the time-stepping parameter rj, we 
see that for r\ — 0.6 there is a > 1% suppression of power for 
fc/fcfun > 100. This can be attributed to the large time-step 
not being able to follow the rapid changes in the acceleration 
of particle orbits in the cores of halos - and hence the failure 
to capture the complex orbit of particles in dense environ- 
ments. This discussion is limited to PKDGRAV run parameters; 
a detailed study of the dependence of the power spectrum 
on Gadget-2 run parameters can be found in |Smith et al.| 
(20121. 



6.1.2 Variation with initial conditions: small boxes 

Figure [9] shows the variation of the mass power spectra with 
the choice of 2LPT or 1LPT initial conditions, and with the 
adopted initial start redshift for the small box simulations 
at z = 10. The top panel of Fig. [9] shows the results for the 
1LPT initial conditions. We observe that the power spectra 
are only converged on the largest scales. On smaller scales, 
we find that the power increases with increasing start red- 
shift and that the results are almost converged at the percent 
level only after df/a, > 80 expansions. 

The middle panel of Fig. [9] shows the results for the 
2LPT initial conditions. Here we find that simulations that 
were started with z, > 100 are converged for k/k{ un < 90. 
For simulations that possess lower start redshifts we find 



Note that owing to the fact that we are comparing results from 
the L = 17.5fc. _1 Mpc at Z = 10 and L = 2048h~ 1 Mpc boxes at 
z = 0, we shall refer to wavemimbers in units of the fundamental 
frequency fcf un = 2tt/L. 
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Figure 5. Variation in the halo mass function as a function of start redshift, for 1LPT and 2LPT initial conditions using the TV-body 
code GADGET-2. Top panel: results for simulations started with 1LPT initial conditions. Bottom panel: results for simulations started with 
2LPT initial conditions. All panels are for the L = 17.625 /i —1 Mpc box at z = 10.0. Convergence of GADGET-2 runs with initial conditions 
are very similar that found for PKDGRAV (Fig. [2| . 



again a suppression of power, although the effect is much 
reduced when compared to the equivalent 1LPT start red- 
shift. On smaller scales, k/kf nn > 100, we find that the power 
is 1% converged for 100 < z; < 400 (10 < o f /oi < 40), while 
the z = 800 start has up to 2% less power. 

The bottom panel of Fig.[9] presents the ratio of the 
power spectra obtained from the 1LPT simulations with the 
2LPT power spectra, for various start redshifts. We see that 
the convergence of the results from the 1LPT simulations 
with the 2LPT simulations is very slow, and that percent 
level convergence is only obtained for z\ > 800. 

Before continuing, we note that, whilst it appears that 
percent level convergence in the power spectra may be 
achieved between 1LPT and 2LPT for very high start red- 
shifts, we have already shown in fj5] that such high start 
redshifts are too early to produce an accurate mass func- 
tion, owing to numerical noise. We are therefore cautious 
about such convergence. 

6.1.3 Variation with initial conditions: large boxes 

We now repeat the same set of tests as done for the pre- 
vious sub-section, only this time we now consider the L = 
2048 /i _1 Mpc simulation cubes at z = 0. 

Figure |10| bottom panel, shows that the 1LPT and 
2LPT initial matter power spectra, measured at z = Zi, are 
converged at the 1% level with respect to each other for the 
same start redshift. However, as indicated in the top panel 
of Fig.|10| the evolved spectra started with 1LPT are not 



converged. On the other hand, the simulations started with 
the 2LPT initial conditions, appear to be converged at the 
1% level for Of/fli < 100, except perhaps at the smallest 
scales, even for start redshifts as low as z; ~ 30. We note 
that the evolved 1LPT versus 2LPT simulations, started 
with Zi = 200, are significantly discrepant with respect to 
the other results. As for the case of the mass function, we 
conjecture that this start redshift is too early for the code 
PKDGRAV to produce an accurate integration of the equations 
of motion. This reinforces our earlier findings that 1LPT ini- 
tial conditions are inadequate for accurate simulations. 

Several earlier studies have investigated the importance 
of 1LPT/2LPT initial conditions on the matter power spec- 
trum ( Crocce et al. 2006 Heitmann et al. 2010 1. Our findings 



are broadly consistent with these studies. However, |Heit-| 
mann et al. (20101 advocated that 1LPT initial conditions 
started from z\ = 200 would lead to better than 1% preci- 
sion matter power spectra. Clearly such a statement is code 
and run parameter dependent, and one should be careful of 
increased numerical errors that may allow consistency be- 
tween 1LPT and 2LPT while still resulting in inaccurate 
power spectra. 



6.2 1-point PDF of matter fluctuations 

We now investigate the impact of simulation parameters and 
1LPT versus 2LPT initial conditions on the 1-point PDF. 
At high densities, the 1-point PDF is a useful probe of the 
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in the L = 2/i _1 Gpc box. The agreement with Gadget-2 using 
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with cj = 30. Extremely early starts ( af/oj > 100) lead to serious 
errors independent of the simulation code used. 



central regions of dark matter halos, reflecting many prop- 



erties of a "stacked" halo density profile (e.g. Scherrer & 
Bcrtschinger 19911. There are some technical subtleties as 



to how one computes the 1-point PDF, since it requires an 
estimator for the matter density at a given point. The pro- 



cedure of estimation in general requires one to smooth the 
particle distribution and hence the results depend up on 
adopted smoothing scale (see for example |Watts fc Taylor] 
20011. Here we have chosen to compute the 1-point PDF 



with a 64 particle nearest neighbor kernel. This operates in 
a similar way to the SPH-kernel and constitutes an adaptive 
smoothing scale. 

Figure [TT| top panel, shows the variation of the 1-point 
PDF with the tree-opening angle parameter 0. We note that 
the most significant changes are in the regions of highest 
density, though sensitivity to <d, beyond the statistical fluc- 
tuations, is relatively low. Figure |11| middle panel, shows 
the variation of the 1-point PDF with the force softening e. 
This clearly shows that the effect of too large force softening 
is to damp the density distribution in the highest density 
inner regions of dark matter halos. This "puffs up" halos, 
which may explain the increased abundances of lower mass 
halos with increased softening length. We also note that as 
e = Z/50 the dense regions appear to be again suppressed. 
This we attribute to violent two-body encounters that can 
evaporate halo cores. Figure |11| bottom panel, shows the 
variation of the 1-point PDF with the time-stepping param- 
eter rj. We see that the results are well converged provided 
V i$ 0.15. The suppression of the high density PDF for large 
rj reinforces our earlier speculation, that if r\ is too large, 
then the particle orbits in the cores of halos can not be inte- 
grated sufficiently accurately, damping the densities in the 
inner regions of dark matter halos. 

Figure|i"2"lpresents the 1-point PDF for 1LPT and 2LPT 
initial conditions for various start redshifts. As can be clearly 
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with the simulation run parameters for the ./V-body code PKDGRAV. 
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Figure 10. Top panel: Ratio of the evolved z = matter power 
spectra for runs with various initial start redshift 2j. The evolved 
1LPT and 2LPT power spectra from the early z; = 200 start 
are very similar to each other, but they lie well above the con- 
verged results of lower redshift 2LPT starts. Bottom panel: Ini- 
tial ratio of the 1LPT to 2LPT matter power spectra at two 
selected initial start redshifts. Differences in 1LPT and 2LPT 
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simulation evolves. The simulations of this figure are large box 
L = 2048/i _1 Mpc simulations run with PKDGRAV. 



seen, the results for the 1LPT initial conditions converge 
very slowly with start redshift. We also note that the both 
the high- and low-density regions appear to be less dense for 
the simulations that were started with low Zi. For the 2LPT 
simulations, convergence is reached at much lower start red- 
shifts, roughly af/di = 10 expansion factors of the cube (i.e. 
around z; ~ 100). 

We note all of the converged parameter values, are 
broadly consistent with those that we identified for the mass 
function in ij4] though variations in the PDF at the highest 
densities are generally larger than 1%. 



7 DISCUSSION: REMAINING CHALLENGES 
FOR < 1% ACCURATE MASS FUNCTIONS 

In this section we discuss the remaining challenges that we 
will have to face in order to approach better than 1% accu- 
rate dark matter halo mass functions. 



7.1 Mass resolution 

In the suite of tests above, we have seen indirect evidence 
that halos with fewer than N ~ 1000 particles, are unlikely 
to be useful for deriving high accuracy estimates of the mass 
function. This suggests that there is a critical mass resolu- 
tion, below which systematic numerical errors are difficult 
to control. Interestingly, this resolution is somewhat worse 
than the N ~ 300 particle resolution limit expected for a 
pure particle-mesh (PM) code with mesh spacing equal to 
the initial inter-particle separation (Lukic et al. 20071. It 



is however still better than the more conservative value of 
N ~ 2000 particles proposed by Bhattacharya et al. (2011 1. 
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Figure 11. Variation of the 1-point PDF of dark matter den- 
sity fluctuations with the simulation parameters: Top panel, tree- 
opening angle ©; Middle panel, force softening e; and Bottom 
panel, time-stepping parameter r). All results here were obtained 
from the small L = 17.625 /i —1 Mpc simulations at z = 10 us- 
ing the tree-code PKDGRAV. Sensitivity to numerical parameters is 
greatest at the highest densities. 
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Additional support for our claim, comes from the work of 



Trenti et al. ( 2010j), who show through mass-resolution tests, 
that on a halo-by-halo basis, halo masses with N < 1000 
particles are systematically too low. Our statistical limita- 
tions mean we can not rule out the possibility that halos 
resolved with more N > 3000 particles might be accurate 
over a larger range in at/ai than what we find for smaller 
halos. In a subsequent paper, we will examine directly the 
mass resolution convergence. 

If the critical resolution limit of N ~ 1000 particles for 
accurate halo statistics is upheld, then this suggests that the 
tree code technique may not have much advantage over the 
PM code technique in recovering an accurate gravity-only 
mass function. Of course the higher force resolution for tree 
codes enables better modelling of higher density regions. 



7.2 Statistical precision 



The full-sky volume out to z = 2 is V M ~ 200 h~ A Gpc d . This 
sets the requirement on the minimum simulation volume 
needed to replicate the survey volume accessible to future 
cluster surveys. This would correspond to a single simula- 
tion cube with a side length of roughly L > 3.67/i _1 Gpc, 
or if one wants to replicate the full sky-light cone, then one 
would require a cube of length L ~ 8/i~ 1 Gpc. Obviously 
performing such a huge simulation with sufficient mass res- 
olution to obtain N ~ 1000 particles per halo, for all halos 
with masses M > 1O 13,5 /i -1 M0 would be very challeng- 
ing prospect. Such total volumes may be cheaply covered 
by combining the results from many smaller volume simu- 
lations. However, individual simulation boxes must be large 
enough to avoid systematic errors due to mode-discreteness 
near the box-scale and due to the lack of super-box scale 
power (a "DC-mode" can help for multiple realization en- 
sembles |Sirko|2005| >. 

An estimate of the minimum box size required to avoid 
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Figure 13. Estimates of the relative dependence of the mass 
function on simulation cube length compared to the case of an 
infinite cube. The dependence of the mass function on L is cal- 
culated by assuming that the power on scales k < fc tun = 2n/L is 
zero. The bottom panel is the expected Poisson error of the cumu- 
lative halo number count within a single simulation volume. Pre- 
dictions were obtained using the mass function of Bhattacharya 
|et al.||20TT) . 



suppressing massive halo formation can be made by comput- 
ing the effect of missing power at wavelengths larger than 
the box on the (empirically fit) analytic form of the mass 
function. Figure [13] (top panel) demonstrates that a simu- 
lation box of roughly L ~ 2/t~ 1 Gpc box should be able to 
capture the mass function at sub-percent level for all halos 
with masses M < 10 16 /i _1 M e at z = 0. However, m a sin- 
gle realization of such a volume, statistical accuracy will be 
much lower; Poisson errors remain larger than 1% well below 
10 15 /i _1 Mq. One caveat is that this calculation underesti- 
mates the finite volume effects due to the discreteness of 
modes near the box scale (see for example Reed et al.|2007 
|Smith fc Marian|201l| . 



7.3 Verification of absolute accuracy 

The task of verifying that we have actually obtained the 
true answer is somewhat circular, since it implies that we 
know already what the true answer is. On our path toward 
the true answer, we should consider the possibility that our 
simulations suffer from some level of false convergence. One 
obvious approach to addressing this issue will be to verify 
that independent simulation codes give the same results at 
the desired level of accuracy. However, this does not take 
into account the pernicious systematic errors, such as the 
false convergence with redshift that we observed for the 
1LPT simulations. In the future, a more complete approach 
would provide us with a theoretical framework for objec- 
tively quantifying 'accuracy' and enable us to identify direc- 
tions in parameter space that would allow us to approach 
our desired goal. 

There are a number of systematic errors that one will 
need to characterise in detail. One in particular is that as- 
sociated with the coarse graining of phase space. Spurious 
perturbations related to mass discreteness may lead to the 
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collapse of small structures around lattice points (Melott 
1990||Joyce fc Marcos|2007b| |Joyce et al.|2009| >. T his effect 



is well-known for Warm Dark Matter simulations (Wang & 



White 2007 Schneider et al. 20121, and it therefore must 



also be present in CDM simulations. Other effects relating 
to mass discreteness in particle codes have been discussed 



by a number of authors (e.g. Splinter et al.|1998 Joyce & 



Marcos|2007a| |Romeo et al.|2008 l 

We have attempted to steer away from making direct 
statements regarding whether particular papers had errors 
and by how much because quantifying the accuracy of the 
works of other authors would require repeating their simu- 
lations with exactly the same codes and run parameters. A 
number of widely used fits in the literature used 1LPT initial 
conditions, many of them with a lower start redshift than 
expected to be required for good agreement between 1LPT 
and 2LPT ICs. In those systematic under-abundance 

of the halo mass function, especially at high masses or high 
redshifts is implied. 

An informative comparison can be made from the fact 
that the fitting formulae of multiple authors are in reason- 
able agreement with each other. For example, there is agree- 
ment at the 2 — 3% level between the FoF mass function 
fits of |Crocce et al.| §20l6\ using 2LPT ICs), |Bhattacharya| 
et al.| ( |2011| using a mix of 2LPT and 1LPT with a high 
start redshift) and |Angulo et a~ (2012 using 2LPT), over 
a wide mass range. Also, there is ~ 5% agreement between 



restriction may become less severe as the mass resolution 
of the simulation is increased. The logic being that more 
cosmological power at small scales may enable earlier start 
redshifts to be simulated without leading to an increase in 
the amount of spurious structure. And, as shown in § [5] the 
mass function sensitivity to start redshift is smaller for halos 
resolved with more particles. The upper limit of the allowed 
start redshift range may be code dependent. For example, 
as we discussed in §5.2| a particle-mesh technique may al- 
low higher start redshifts, but typically comes at the cost of 
worse force resolution. 

The narrow range in implied af/ai presents a challenge 
for simulations with very large dynamic range, wherein it 
would be difficult to model accurately the mass function 
of massive cluster halos forming at z ~ 0, while also cap- 
turing accurate evolution of the early generations of dwarf 
galaxy halos forming already at z ~ 10. Even though the 
fraction of mass assembled into galactic halos at such early 
times is small, early galaxy formation occurs preferentially 
in Lagrangian regions where clusters will later form. These 
early-forming galactic halos should be modeled accurately 
because their feedback processes could have have significant 
effects on the eventual baryon and total cluster mass content 
through energy injection and preheating, which could begin 
very early (e.g. Benson & Madau 20031. 



the Tinker et al. ( |2008| 1LPT ICs) and the recent [Watson 7.5 Impact of baryons 



et al. (2012 1LPT ICs with a higher start redshift) spherical 
over-density mass function fits. There is a caveat that such 
comparisons are only useful to the extent that the different 
studies do not suffer from the same systematic errors. 

If we consider the widely-used |Tinker et al.] ( |2008[ ) mass 
function, the authors stated statistical accuracy of 5% is 
comparable to the systematic error, estimate from Eqn. |11[ 
that we would expect from their results due to their use 
of 1LPT initial conditions with af/oi ~ 50, although this 
error would approach 10% at the highest masses, while being 
smaller at lower masses ( < 1O 15 /i _1 M ). They point out 
some dependence on start redshift between some of their 
simulations, and exclude from their fit those with the lowest 
start redshifts due to a systematic under-abundance of halos. 
We can thereby deduce that the systematic errors in that 
study due to initial conditions are generally within their 
quoted statistical accuracy. 



7.4 Narrow scale factor range for accuracy 

In S|5] we showed that significant errors are introduced in 
the halo mass function when the total expansion of the box 
lies outside the limits (10 < at/ai < 50). Of course, one 
would expect that the epoch of a particular halo's forma- 
tion is more important than final simulation output for as- 
sessing whether that halo, and by extension, the mass func- 
tion, has been modelled accurately. A brief supporting ar- 
gument is that any errors introduced to early structures are 
unlikely to "evolve away", though suppression of the num- 
ber of early forming halos may become less noticeable, at 
fixed mass, after more halos have formed at lower redshifts. 
The implication is that halos in any particular cosmologi- 
cal simulation can be modeled accurately only for a range 
in formation redshifts of Ai+ Z ~ 5. It is possible that this 



We have purposely ignored the important effects of baryons 
on the halo mass function. Recent hydrodynamic simula- 
tions have shown the range of plausible baryon effects on 
total cluster halo masses to be up to ~ 15% within the radius 
enclosing 500 x the critical density of the universe ( |Stanek| 
et al. 2009). Even for "adiabatic simulations" wherein gas 



cooling, star formation and feedback are ignored, baryons 
may have up to ~ 7% effects on the halo mass function ( |Cui| 
et al.|2012| >. 

Baryon influences thus present a serious challenge for 
percent level accuracy in the mass function required for 
planned dark energy missions. However, accurate gravity- 
only simulations, as we have explored in this work, are a pre- 
requisite for future simulations with more complete baryon 
physics aiming to obtain percent level accuracy in the mass 
function or other properties. To save computational cost, 
cosmological constraints might rely on a combination of 
baryon and gravity-only simulations. For example, hydro- 
dynamic simulations of limited cosmological volumes could 
be used to calibrate the effects of baryons on halos as well 
as to derive relations between halos and observable proper- 
ties. Then, large volume gravity-only simulations might be 
utilized to map the dependence of halo numbers and other 
properties on cosmology. 



7.6 Calibrating mass— observable relationship 

A further difficulty in using the halo mass function for cos- 
mology is that any practical halo definition one might use, 
such as through the FoF or SO algorithms, typically have 
no direct observable counterpart. For example, the emission 
from X-ray clusters is determined by the (square of) the 
gas density distribution. One might argue that this leads to 
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more spherical clusters the abundance of which may be bet- 
ter matched to SO halos in simulations, rather than to FoF 
halos (see discussion in Kravtsov fc Borgani|2012 I. Though 



weak lensing may help to calibrate halo masses ( Marian et al 



2009 Becker & Kravtsov 2011 Mandelbaum et al. 20101, 



optical, x-ray, and Sunyaev Zel'dovich cluster masses have 



large scatter and systematic uncertainties (Angulo et al 



20121 



Mock observations from simulations may thus be a su- 
perior means of obtaining a more accurate mass function 
in the observable plane, especially once baryon properties 
are better modeled. Ultimately, an observationally useful 
and accurate (cluster) mass function will involve modeling 
baryon physics and observable properties (or mock observa- 
tions) and represents a formidable challenge for cosmology 
- a challenge that must be solved in order to fully exploit 
future and even current cluster surveys of the Universe. 



8 CONCLUSIONS AND GUIDELINES FOR 
ACCURATE SIMULATIONS 

In this paper we have explored the dependence of the mass 
function of dark matter halos on simulation run parameters 
and initial conditions. Our aim has been to perform conver- 
gence tests that will illuminate the path to obtaining percent 
level accuracy in this statistic. This will be a requirement 
for future cluster surveys of the Universe that aim to help 
constrain the nature of dark energy or dark gravity. 

In i|2]we gave a brief overview of the simulation method, 
paying special attention to how one sets up initial con- 
ditions, either using Zel'Dovich approximation (1LPT) or 
2LPT. We described the simulation codes that we have em- 
ployed PKDGRAV and Gadget-2, with the former being the 
main code used throughout this study. 

In [j3]we described the large suite of iV-body simulations 
that we have performed to study these problems. All simu- 
lations were run at N = 1024 3 and we covered two regimes: 
high redshift (z = 10), small scale (L = 17.625 ft _1 Mpc) and 
low redshift (z = 0) large scale (L = 2ft~ 1 Gpc). 

In S|4] we explored the dependence of the mass func- 
tions on the simulation parameters and found the follow- 
ing: the resultant mass functions were rather insensitive to 
the choice of the tree-opening angle, provided O < 0.7; the 
results for halos resolved with fewer than N ~ 1000 par- 
ticles were sensitive to the choice of force softening, with 
larger values tending to increase the abundance of halos in 
this regime; results were fairly insensitive to the size of the 
adaptive time-step parameter and that 1% converged results 
could be achieved for r\ < 0.15. We also demonstrated that 
the use of anti-aliasing filters, such as the Hann filter, to set 
up initial conditions, can lead to ~ 30% suppression in the 
abundance of halos resolved with N < 1000 particles. We 
do not advocate the use of the Hann filter, since there is no 
aliasing in the initial conditions to correct. 

In Sj5] we performed a detailed study of the impact of 
the choice of initial conditions on the mass function. We 
found that the results from simulations that are initialized 
with 1LPT converge very slowly as the start redshift is in- 
creased. The effect of too low a start redshift being the 
suppression of the formation of high mass halos. Further- 
more, for the large box simulations, we also found simula- 
tions started at very high redshifts Zi > 200 would fail to 
correctly follow the build up of structure due to the rela- 
tive increase in numerical noise. Furthermore, 1LPT initial 
conditions exhibit "false convergence" with increasing start 
redshift. Simulations starting from 2LPT initial conditions 
proved to have very good convergence properties and for 
simulations that underwent 10-50 expansion factors, yielded 
percent level convergence in the halo mass function at the 
1024 3 resolution of our tests. We made a direct comparison 
of these results obtained from integration of the initial con- 
ditions with the tree-code PKDGRAV with results from the 
Tree-PM code Gadget-2, and found almost identical be- 
haviour. However, a detailed comparison of the mass func- 
tions from the two codes revealed that Gadget-2 produced 
a ~ 10% increase in the mass function for halos resolved 
with N < 10 2 particles. These results extend and support 
the earlier findings of ( Crocce et al.|2"o"06 l. 

In [J6] we explored the convergence properties of two 
other statistics of the density field, namely the matter 
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Table 2. Approximate run and initial condition parameters that permit percent-level simulation convergence in the halo mass function 
extracted from gravity-only simulations. 2LPT initial conditions are required. Run parameters 0, e, and r\ are the tree-opening angle, 
force softening, time-stepping parameter. t)q denotes the Gadget-2 time-step parameter ErrTolIntAccuracy = rp /2. fitf/ai is the ratio of 
initial to final scale factor at which halo properties are to be considered. L is an estimate of the minimum box length needed to avoid 
systematic errors in the mass function while Vn is the comoving volume of the universe accessible to future cluster surveys. 



power spectrum and the 1-point probability density function 
(PDF) of matter fluctuations. We found that the simulation 
parameters that produced < 1% convergence in the mass 
function would also lead to good convergence behaviour in 
these statistics. In addition, too high a start redshift for 
either 1LPT or 2LPT initial conditions would lead to sys- 
tematic errors. On the other hand, the results from the sim- 
ulations run with 2LPT initial conditions demonstrated ex- 
cellent convergence behaviour. 

In summary, Table [2] presents a general recipe for the 
parameters needed for percent accuracy of the mass function 
within a gravity-only simulation using a tree code. Except 
for the tree opening-angle O, which has some dependence on 
the specific tree used, all the other run parameters can be ap- 
plicable to other tree codes. This list shows required values 
but is not complete. In future work, one would expect this 
table to be extended to include the following: if PM forces 
are used for large scale force computation, then parameters 
controlling their accuracy, such as the size of the PM grid 
should be included; multipole expansions are used to com- 
pute the tree forces, and different codes use different orders: 
which order is sufficiently accurate for our purposes? Also, 
there should be some entry associated with the parameters 
that control the halo finder (halo definition). 

Ultimately, inferring cosmological parameters from the 
cluster mass function will require a number of other issues 
to be solved relating to baryons and observable properties. 
Among the difficulties that baryons pose is the gravitational 
coupling of baryon processes to dark matter ( |Somogyi fc| 



Smith 2010 van Daalen et al. 20111. Inferring observable 



properties from the simulations for comparison via mass- 
observable relations or by direct mock catalogs is a further 
complexity. Thus, percent level accuracy in numerical sim- 
ulations represents a formidable challenge, but one that we 
must meet if future surveys of the Universe are to live up to 
their potential. 
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